Introduction

This analysis gives a summary-level overview of the U.S. Craft Beers and Breweries datasets supplied to us by the CEO and CFO of Budweiser. It covers the number of breweries per state, summary statistics for alcohol by volume (ABV) and International Bitterness Units (IBU), and the relationship between bitterness and alcohol content. After the presentation, you should have a better understanding of which areas to focus on and which types of beer consumers favor.

Breweries per state?

R-code Explanation:
  • Read the datasets into variables
  • Count breweries by state and sort in descending order
  • Mutate the table to create the bar plot table
  • Create a new column to translate fully spelled-out state names into abbreviations
  • Merge the brewery count by state with the map_data table from R. This is needed to create the heat map.
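The counting and state-name steps above can be sketched as follows. This is a minimal illustration on a made-up six-row table, not the original code: the `breweries` rows are invented, and base R is used so the sketch runs without extra packages. The name lookup relies on R's built-in `state.abb` and `state.name` vectors and is shown going from abbreviation to lowercase full name, which is the form `map_data("state")` uses in its `region` column; the same vectors work in the other direction.

```r
# Toy stand-in for the breweries table (invented rows, for illustration only).
breweries <- data.frame(
  Brew_ID = 1:6,
  State   = c("CO", "CO", "CA", "MI", "CO", "CA"),
  stringsAsFactors = FALSE
)

# Count breweries per state.
counts <- as.data.frame(
  table(State = breweries$State),
  responseName = "CountofBreweries",
  stringsAsFactors = FALSE
)

# Sort in descending order of count, breaking ties alphabetically.
counts <- counts[order(-counts$CountofBreweries, counts$State), ]
rownames(counts) <- NULL

# Add a lowercase full-name column to use as the key when merging with map data.
# Note: "DC" is not in state.abb, so it would come back as NA and need special handling.
counts$region <- tolower(state.name[match(counts$State, state.abb)])

print(counts)
```

The resulting `region` column is what a later `merge()` against the map table would join on.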
Analysis Explanation:

The heat map and bar plot show that Colorado has 47 breweries, the most in the US, while Washington D.C., North Dakota, South Dakota and West Virginia each have only one.

Table 1: Barplot Brewery Count in Each State
State CountofBreweries
CO 47
CA 39
MI 32
OR 29
TX 28
PA 25
MA 23
WA 23
IN 22
WI 20
NC 19
IL 18
NY 16
VA 16
FL 15
OH 15
MN 12
AZ 11
VT 10
ME 9
MO 9
MT 9
CT 8
AK 7
GA 7
MD 7
OK 6
IA 5
ID 5
LA 5
NE 5
RI 5
HI 4
KY 4
NM 4
SC 4
UT 4
WY 4
AL 3
KS 3
NH 3
NJ 3
TN 3
AR 2
DE 2
MS 2
NV 2
DC 1
ND 1
SD 1
WV 1

Merging the two datasets. Print the first 6 observations and the last 6 observations to check the merged file.

R-code Explanation:
  • Read in the beer dataset
  • Merge the beer data with the brewery data
  • Create a kable table to output the first and last 6 observations
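A minimal sketch of the merge and the first/last check is given below. The layout of the two input tables (a `Name` column in each, and a `Brewery_id` key in the beer table) is an assumption; only the merged column names come from Tables 2 and 3. The rows are a small subset of those tables.

```r
# Assumed layout of the brewery table (a subset of rows from Tables 2 and 3).
breweries <- data.frame(
  Brew_ID = c(1, 557, 558),
  Name    = c("NorthGate Brewing", "Butternuts Beer and Ale",
              "Sleeping Lady Brewing Company"),
  City    = c("Minneapolis", "Garrattsville", "Anchorage"),
  State   = c("MN", "NY", "AK"),
  stringsAsFactors = FALSE
)

# Assumed layout of the beer table; Brewery_id is a hypothetical key name.
beers <- data.frame(
  Name       = c("Pumpion", "Stronghold", "Porkslap Pale Ale",
                 "Urban Wilderness Pale Ale"),
  Beer_ID    = c(2689, 2688, 49, 30),
  ABV        = c(0.060, 0.060, 0.043, 0.049),
  IBU        = c(38, 25, NA, NA),
  Brewery_id = c(1, 1, 557, 558),
  stringsAsFactors = FALSE
)

# Rename the clashing Name columns so the merged headers are unambiguous.
names(breweries)[names(breweries) == "Name"] <- "Brewery.Name"
names(beers)[names(beers) == "Name"] <- "Beer.Name"

# Join each beer to its brewery on the brewery identifier.
merged <- merge(breweries, beers, by.x = "Brew_ID", by.y = "Brewery_id")

# Inspect both ends of the merged table as a quality check.
print(head(merged, 6))
print(tail(merged, 6))
```

If knitr is installed, wrapping each call as `knitr::kable(head(merged, 6))` renders the same rows as a formatted table.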
Analysis Explanation:

To analyze the data more fully, we have to merge the U.S. Craft Beers dataset with the Breweries dataset. This lets us see the breweries in each state together with the beers each brewery produces and their style, alcohol content and IBU. Tables 2 and 3 are output as a quality check on the merge. As you can see, each row lists a brewery and its state along with the details of a beer it produces.

Table 2: First 6 Observations
Brew_ID Brewery.Name City State Beer.Name Beer_ID ABV IBU Style Ounces
1 NorthGate Brewing Minneapolis MN Pumpion 2689 0.060 38 Pumpkin Ale 16
1 NorthGate Brewing Minneapolis MN Stronghold 2688 0.060 25 American Porter 16
1 NorthGate Brewing Minneapolis MN Parapet ESB 2687 0.056 47 Extra Special / Strong Bitter (ESB) 16
1 NorthGate Brewing Minneapolis MN Get Together 2692 0.045 50 American IPA 16
1 NorthGate Brewing Minneapolis MN Maggie’s Leap 2691 0.049 26 Milk / Sweet Stout 16
1 NorthGate Brewing Minneapolis MN Wall’s End 2690 0.048 19 English Brown Ale 16
Table 3: Last 6 Observations
Brew_ID Brewery.Name City State Beer.Name Beer_ID ABV IBU Style Ounces
556 Ukiah Brewing Company Ukiah CA Pilsner Ukiah 98 0.055 NA German Pilsener 12
557 Butternuts Beer and Ale Garrattsville NY Porkslap Pale Ale 49 0.043 NA American Pale Ale (APA) 12
557 Butternuts Beer and Ale Garrattsville NY Snapperhead IPA 51 0.068 NA American IPA 12
557 Butternuts Beer and Ale Garrattsville NY Moo Thunder Stout 50 0.049 NA Milk / Sweet Stout 12
557 Butternuts Beer and Ale Garrattsville NY Heinnieweisse Weissebier 52 0.049 NA Hefeweizen 12
558 Sleeping Lady Brewing Company Anchorage AK Urban Wilderness Pale Ale 30 0.049 NA English Pale Ale 12

Reporting out NA’s in each column

R-code Explanation:
  • Calculating the number of NA’s in each column
  • Creating bar chart for each NA count
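The NA count can be sketched in one line of base R. The small `merged` table here is invented purely to show the mechanics; the column names follow Tables 2 and 3.

```r
# Invented rows with a few missing values, for illustration only.
merged <- data.frame(
  ABV   = c(0.060, 0.055, NA, 0.049),
  IBU   = c(38, NA, NA, NA),
  Style = c("Pumpkin Ale", "German Pilsener", NA, "English Pale Ale"),
  stringsAsFactors = FALSE
)

# is.na() gives a TRUE/FALSE matrix; colSums() counts the TRUEs per column.
na_counts <- colSums(is.na(merged))
print(na_counts)

# A bar chart of the counts could then be drawn with, for example:
# barplot(na_counts, ylab = "Missing values")
```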
Analysis Explanation:

Further analysis showed that there are missing values in the U.S. craft beer dataset. The chart below shows the count of missing values in each relevant column. The analysis uses only the values that are present.

Summary Analysis

R-code Explanation:
  • Calculate overall summary statistics on the original dataset: mean, median, max, min and 75th percentile
  • Create interactive line graph
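The summary statistics listed above can be computed with a small helper. This is a sketch, not the original code: `summarise_column` is a hypothetical name, and the input is the six ABV values from Table 2 plus one NA to show that missing values are skipped.

```r
# Hypothetical helper: five summary statistics for one numeric column, ignoring NAs.
summarise_column <- function(x) {
  c(
    min    = min(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    mean   = mean(x, na.rm = TRUE),
    q75    = unname(quantile(x, 0.75, na.rm = TRUE)),
    max    = max(x, na.rm = TRUE)
  )
}

# The six ABV values from Table 2, plus one NA.
abv <- c(0.060, 0.060, 0.056, 0.045, 0.049, 0.048, NA)

stats <- summarise_column(abv)
print(round(stats, 4))
```

Applying the same helper to the IBU column gives the matching bitterness summary.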
Analysis Explanation:

It’s important to understand these summary statistics because looking only at the maximum ABV and IBU levels gives the impression that Colorado produces only beers with high alcohol content, or that Oregon produces only very bitter beers. Combining the bar graph with Figure 3 shows that Colorado does not exclusively produce high-alcohol beers; it makes a few beers with high ABV. That is why the mean ABV is pulled upward: unlike the median, the mean is not resistant to outliers.
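The effect of a single strong beer on the mean can be seen with a few numbers. In this sketch the base values are the six ABV readings from Table 2, and the added value of 0.125 is the highest ABV cited in this report; putting them in one vector is purely illustrative.

```r
# Six typical ABV values (from Table 2).
typical <- c(0.060, 0.060, 0.056, 0.045, 0.049, 0.048)

# The same beers plus one high-alcohol outlier.
with_outlier <- c(typical, 0.125)

# How far each statistic moves when the outlier is added.
mean_shift   <- mean(with_outlier) - mean(typical)
median_shift <- median(with_outlier) - median(typical)

cat("Shift in mean:  ", round(mean_shift, 4), "\n")
cat("Shift in median:", round(median_shift, 4), "\n")
```

The mean moves roughly three times as far as the median, which is why the median is the safer summary when a state has a few unusually strong beers.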

States that have the maximum alcoholic beer or the most bitter beer.

R-code Explanation:
  • Calculating the median ABV and IBU for each state, ignoring NA values
  • Creating an interactive bar chart of the aforementioned calculation for each state
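The question in this section's heading (which state has the strongest beer and which has the most bitter one) can be answered with a direct lookup. The sketch below is an assumption about one way to do it, run on a handful of rows taken from Tables 2 and 3, so its answer applies only to that subset and not to the full dataset.

```r
# A subset of rows from Tables 2 and 3.
merged <- data.frame(
  State     = c("MN", "MN", "MN", "CA", "NY", "AK"),
  Beer.Name = c("Pumpion", "Parapet ESB", "Get Together", "Pilsner Ukiah",
                "Snapperhead IPA", "Urban Wilderness Pale Ale"),
  ABV       = c(0.060, 0.056, 0.045, 0.055, 0.068, 0.049),
  IBU       = c(38, 47, 50, NA, NA, NA),
  stringsAsFactors = FALSE
)

# which.max() skips NA values and returns the position of the largest entry.
strongest   <- merged[which.max(merged$ABV), c("State", "Beer.Name", "ABV")]
most_bitter <- merged[which.max(merged$IBU), c("State", "Beer.Name", "IBU")]

print(strongest)
print(most_bitter)
```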
Analysis Explanation:

It’s important to look at the maximums, means and medians of the data to understand what consumers want in each state. The graph below shows the median alcohol and bitterness content by state. Here you start to see the different combinations of ABV and IBU. For example, Washington D.C. has the highest median ABV, but Maine has the highest median IBU. Is there a correlation?

Computing the Median Alcohol Content (ABV) and International Bitterness Unit (IBU) for each state.

R-code Explanation:
  • Calculating the median ABV and IBU for each state, ignoring NA values
  • Creating an interactive bar chart of the aforementioned calculation for each state
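A minimal sketch of the per-state median calculation is shown below, using base R's `aggregate()` on the Minnesota and New York rows from Tables 2 and 3. Passing the columns as a data frame, rather than through a formula, keeps a row whose IBU is missing in the ABV calculation; this is a design choice of the sketch and may differ from the original code.

```r
# Minnesota and New York rows from Tables 2 and 3.
merged <- data.frame(
  State = c(rep("MN", 6), rep("NY", 4)),
  ABV   = c(0.060, 0.060, 0.056, 0.045, 0.049, 0.048,
            0.043, 0.068, 0.049, 0.049),
  IBU   = c(38, 25, 47, 50, 26, 19, NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Median of each column within each state; na.rm = TRUE is passed on to median().
medians <- aggregate(
  merged[c("ABV", "IBU")],
  by    = list(State = merged$State),
  FUN   = median,
  na.rm = TRUE
)

# A state with no recorded IBU at all comes back as NA rather than being dropped.
print(medians)
```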
Analysis Explanation:

It’s important to look at the maximums, means and medians of the data to understand what consumers want in each state. The graph below shows the median alcohol and bitterness content by state. Here you start to see the different combinations of ABV and IBU. For example, Washington D.C. has the highest median ABV, but Maine has the highest median IBU. Is there a correlation?

Is there an apparent relationship between the bitterness of the beer and its alcoholic content?

R-code Explanation:
  • Create an interactive scatterplot with the ABV and IBU data
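Alongside the scatterplot, the strength of the relationship can be put into a number. The sketch below is an addition, not part of the original code: it computes the Pearson correlation and a simple linear fit on the few rows from Tables 2 and 3, which are far too few to say anything about the full dataset and only show the calls involved.

```r
# ABV and IBU for the Table 2 rows, plus one Table 3 row with a missing IBU.
abv <- c(0.060, 0.060, 0.056, 0.045, 0.049, 0.048, 0.055)
ibu <- c(38, 25, 47, 50, 26, 19, NA)

# Pearson correlation using only the rows where both values are present.
r <- cor(abv, ibu, use = "complete.obs")

# Simple linear regression of IBU on ABV; lm() drops incomplete rows by default.
fit <- lm(ibu ~ abv)

cat("Correlation:", round(r, 3), "\n")
print(coef(fit))

# A static version of the scatterplot with the fitted line could be drawn with:
# plot(abv, ibu, xlab = "ABV", ylab = "IBU"); abline(fit)
```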
Analysis Explanation:

Figure 8 shows the relationship between ABV and IBU. As alcohol content rises, IBU tends to rise as well. There are outliers: the beer with the highest ABV, at .125, does not have the highest IBU. The important thing is to look at where the points cluster, since that indicates the kind of beer likely to appeal to the majority of consumers.

Conclusion

The best way to compete with other breweries is to understand how many different beers and breweries there are in each state. Figure 9 gives you that view. Target the states with low brewery and beer type counts. Also, look at the clustering in Figure 8 to identify the most favorable beer profile. For instance, most of the clustering is around ABV levels of .04 to .06 and IBU levels of 20 to 40. This could mean that such beers are in high demand, or that breweries produce them because they are cheap to make. Further analysis with additional data points, such as taste type, would need to be done.
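The clustering claim can be checked numerically by counting how many beers fall inside the stated ABV and IBU ranges. This sketch uses only the six Table 2 rows, so the share it prints describes that subset and not the full dataset.

```r
# ABV and IBU for the six beers in Table 2.
abv <- c(0.060, 0.060, 0.056, 0.045, 0.049, 0.048)
ibu <- c(38, 25, 47, 50, 26, 19)

# Only beers with both values recorded can be placed on the scatterplot.
complete <- !is.na(abv) & !is.na(ibu)

# Beers inside the cluster described above: ABV .04 to .06 and IBU 20 to 40.
in_cluster <- complete & abv >= 0.04 & abv <= 0.06 & ibu >= 20 & ibu <= 40

share <- sum(in_cluster) / sum(complete)
cat("Share of beers inside the cluster:", share, "\n")
```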